CCMine: Efficient Mining of Confidence-Closed Correlated Patterns

Authors

  • Won-Young Kim
  • Young-Koo Lee
  • Jiawei Han
Abstract

Correlated pattern mining has become increasingly important as an alternative or an augmentation to association rule mining. Although correlated pattern mining discloses the correlation relationships among data objects and significantly reduces the number of patterns produced by association mining, it still generates quite a large number of patterns. In this paper, we propose closed correlated pattern mining to reduce the number of correlated patterns produced without information loss. We first propose a new notion of confidence-closed correlated patterns, and then present an efficient algorithm, called CCMine, for mining those patterns. Our performance study shows that confidence-closed pattern mining reduces the number of patterns by at least an order of magnitude. It also shows that CCMine outperforms a simple method that makes use of the traditional closed pattern miner. We conclude that confidence-closed pattern mining is a valuable approach to condensing correlated patterns.
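
The abstract does not define its correlation measure or the closure condition, so the following is only a minimal brute-force sketch under stated assumptions, not the CCMine algorithm itself. It assumes the measure is all-confidence, sup(X) divided by the largest single-item support in X, and that a correlated itemset is confidence-closed when no proper superset has both the same support and the same all-confidence. The function names and the thresholds `min_sup` and `min_conf` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence_closed_patterns(transactions, min_sup, min_conf):
    """Brute-force illustration (exponential in the number of items).

    ASSUMPTION: correlation is measured by all-confidence,
    sup(X) / max(sup({i}) for i in X).
    ASSUMPTION: X is confidence-closed if no proper superset has the
    same support AND the same all-confidence.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    item_sup = {i: support(frozenset([i]), transactions) for i in items}

    # Step 1: collect every itemset that passes both thresholds.
    correlated = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            sup = support(x, transactions)
            if sup < min_sup:
                continue
            # Fraction keeps the equality test in step 2 exact.
            conf = Fraction(sup, max(item_sup[i] for i in x))
            if conf >= min_conf:
                correlated[x] = (sup, conf)

    # Step 2: keep itemsets with no proper superset sharing both values.
    # A superset with equal support and confidence is itself correlated,
    # so searching within `correlated` is sufficient.
    return {
        x: stats
        for x, stats in correlated.items()
        if not any(x < y and stats == correlated[y] for y in correlated)
    }


if __name__ == "__main__":
    txns = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"d"}]
    for pattern, (sup, conf) in confidence_closed_patterns(txns, 2, 0.5).items():
        print(sorted(pattern), sup, conf)
```

On this toy data, `{c}` is kept even though its superset `{a, b, c}` has the same support, because the two differ in all-confidence (1 versus 2/3). Under the assumptions above, that is what separates confidence-closed patterns from patterns closed on support alone.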


Similar resources

MARBLES: Mining Association Rules Buried in Long Event Sequences

Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns that describe events that often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in the sequence. Most of the research on episodes deals with special cases such as serial and ...


Mining Top-K Frequent Closed Patterns without Minimum Support

In this paper, we propose a new mining task: mining top-k frequent closed patterns of length no less than min_l, where k is the desired number of frequent closed patterns to be mined, and min_l is the minimal length of each pattern. An efficient algorithm, called TFP, is developed for mining such patterns without minimum support. Two methods, closed node count and descendant sum are proposed to...


Polynomial-Delay and Polynomial-Space Algorithms for Mining Closed Sequences, Graphs, and Pictures in Accessible Set Systems

In this paper, we study efficient closed pattern mining in a general framework of set systems, which are families of subsets ordered by set-inclusion with a certain structure, proposed by Boley, Horváth, Poigné, Wrobel (PKDD’07 and MLG’07). By modeling semi-structured data such as sequences, graphs, and pictures in a set system, we systematically study efficient mining of closed patterns. For a...


Efficiently Mining Closed Subsequences with Gap Constraints

Mining frequent subsequence patterns from sequence databases is a typical data mining problem, and various efficient sequential pattern mining algorithms have been proposed. In many problem domains (e.g., biology), the frequent subsequences confined by the predefined gap requirements are more meaningful than the general sequential patterns. In this paper we re-examine the closed sequential patter...


Closed Regular Pattern Mining Using Vertical Format

Discovering interesting patterns in transactional databases is often challenging because of the length of the patterns and the number of transactions, which make mining prohibitively expensive in both time and space. Closed itemset mining was introduced from traditional frequent pattern mining and has its own importance in data mining applications. Recently, regular itemset mining gained lot of ...




Publication date: 2004